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1 Introduction 


Large scale structure (LSS) study of Big Bang cosmology tries to explain how an initially flat or 
smooth 3-dimensional surface described by the Robertson-Walker metric evolved into a wrinkled 
one. In terms of density and velocity fields, it explains how an initially homogeneous and Hubble- 
expanding mass distribution evolved into its present inhomogeneous state. It is generally believed 
that LSS was initiated by fluctuations formed at the early universe, and that the subsequent clustering 
was brought about by gravitational interaction between baryonic and dark matter (Kolb & Turner 
1989). As a result, like the physics of dynamical critical phenomena, turbulence, and multiparticle 
production in high energy collisions, problems in LSS are typical of structure formation due to 
stochastic forces and non-linear coupling (Berera & Fang 1994, Barbero et al 1996). The cosmic 
mass (or number) density distribution p{x) can be mathematically tteated as a homogeneous random 
field. 

Traditionally, the statistics, kinetics and dynamics in LSS are represented by the Fourier expan¬ 
sion of the density field, \p{k)\. For instance, the behavior of the LSS in scale space can effectively 
be described by the power spectrum of perturbations P{k) = \p{k)\^. In the case where the homo¬ 
geneous random field p{x) is Gaussian, all statistical features of p{x) can be completely determined 
by the amplitude of the Fourier coefficients. In other words, the two-point correlation function, or 
its Fourier counterpart the power spectrum, are enough to describe the formation and evolution of 
the LSS. 

However, the dynamics of LSS, such as clustering given by gravitational instability, is non¬ 
linear. Even if the field p{x) is initially Gaussian, the evolved density field will be highly non- 
Gaussian. To describe the dynamics of the LSS knowledge of the phase of the Fourier coefficients 
p{k) is essential. As is well known, it is difficult, even practically impossible, to find information 
about phase of the Fourier coefficients as soon as there is some computational noise (Farge 1992). 
This lack of information makes the description of LSS incomplete. Even in the case where the 
phases are detectable, the pictures in physical space, p{x), and the Eourier space, p{k) are separated. 
Erom the former we can only see the scales of the structures, but not the positions of the considered 
structures, and vice versa from the later. It has been felt for some time that the separate descriptions 
between Eourier (scale) and physical (position) spaces may lead to missing key physics. 

In order to resolve this problem, methods of space-scale-decomposition (SSD) which might 
provide information about the phase (or position) and scale of the considered structures have been 
developed. The possibility of simultaneously localizing in both frequency (scale) and time (position) 
is not new in physics. Anybody who listens to music knows that they, at any time, can resolve 
the frequency spectrum. The problem of how to perform this time resolution is also not new in 
physics. Wigner functions in quantum mechanics, and the Gabor transform (Eourier transform on 
finite domain) were early approaches. 

Speaking simply, SSD represents a density field as a superposition of density perturbations 
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localized in both physical and scale spaces. For instance, identification of clusters from a galaxy 
distribution by eyes is a SSD. Generally, all methods of identifying clusters and groups from sur¬ 
veys of galaxies or samples of N-body simulation are SSD. One can list several popular SSDs in 
cosmology as follows: smoothing by a window function, or filtering technique; percolation; the 
friend-to-friend algorithm; count in cells (CIC). 

A common problem of most of the above mentioned SSDs is that the bases, or representations, 
given by these methods is incomplete. Unlike the Fourier representation p(k), these SSDs lose 
information contained in the density field pir'). For insfance, one can complefely reconsfrucf fhe 
densify field p{x) by fhe Fourier coefficienls p{k), buf cannof do fhe same using window fillers, 
CIC, percolation efc. 

All fhese SSDs are, direclly or indireclly, fhe precursors fo fhe DWT (Discrefe Wavelef Trans¬ 
form). The DWT is also a SSD, buf is based on bases sefs which are orfhogonal and complete. The 
DWT is invertible and admissible making possible a complefe represenfafion of LSS wifhouf losing 
information. Unlike fhe Fourier bases (fhe frigonomefric functions) which are inherenfly nonlocal, 
fhe DWT bases have limifed spafial supporf. The DWT allows for an orfhogonal and complefe 
projeclion on modes localized in bofh physical and space spaces and makes possible a mulfiscale 
resolution. 

Moreover, fhe orfhogonal bases of fhe DWT are obfained by (space) franslafion and (scale) 
dilafion of one scale funclion (Meyer 1992, 1993; Daubechies 1992). They are self-similar. This 
franslafion-dilafion procedure allows for an optimal compromise: fhe wavelef fransform gives very 
good spafial resolufion on small scales, and very good scale resolufion on large scales. Therefore, 
fhe DWT is able fo resolve an arbifrary density field simulfaneously in terms of ifs position variable 
and ifs conjugafe counferparf in Fourier space (wavenumber or scale) up fo fhe limif of uncerfainly 
principle. 

There have been affempfs fo use fhe continuous wavelef fransform (CWT) fo analyze LSS 
(Slezak, Bijaoui & Mars 1990; Escalera & Mazure 1992; Escalera, Slezak & Mazure 1992; Mar¬ 
tinez, Paredes & Saar 1993). However, since 1992 if has become clear fhaf fhe CWT is a poor or even 
impossible mefhod fo use as a reasonable SSD. The difference befween CWT and DWT is mafhe- 
mafically essenfial, unlike fhe case for fhe Eourier fransform, for which fhe confinuous-discrefe 
difference is only fechnical (Yamada & Ohkifani 1991; Earge 1992; Greiner, Eipa & Carrufhers 
1995). 

These properties of fhe DWT make if unique among fhe various SSD mefhods. One can expecf 
fhaf some sfafisfical and dynamical fealures of ESS can easily, and in fad only, be described by fhe 
DWT represenfafion. The DWT sfudy of ESS now is in a very preliminary sfage. Neverfheless, 
resulfs have shown fhaf fhe DWT can reveal aspecfs of ESS behavior which have nol been seen by 
fradifional mefhods (Pando & Pang 1995, 1996a, 1996b; Huang el al 1996). These DWT-represenfed 
fealures have also been found fo be effective for discriminaling among models of ESS formalion. 

As we will show fhe DWT opens a new dimension in fhe sfudy of fhe slalisfics and dynamics 
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oftheLSS. 


2 Discrete wavelet transform of density fields 

The real difference in using the discrete wavelet transform (DWT) as compared with say, Fourier 
techniques, comes when one deals with samples of finite extent. Since both the bases are complete, 
the information revealed by both these techniques is equivalent when the function is continuous and 
the limits of the respective inner products or sums can be calculated at infinity. However, once the 
function is not continuous (or, rather, not continuously sampled) or the sum cannot be calculated to 
infinity, the two representations reveal different aspects of the distribution. But this is always the 
case when dealing with either observational data or simulated data. Limited resolution always forces 
one to sample a function at intervals, and no sum can be calculated to infinity. Unlike the Fourier 
transform, the difference between the continuous and discrete transform is not merely technical 
(Yamada & Ohkitani 1991; Farge 1992; Greiner, Lipa & Carruthers 1995). One does not merely 
replace the limit of a sum with infinity and then take a limit. By construction, the discrete and 
continuous wavelet transform are quite different. 


2.1 An example: Haar wavelet 


Let us first consider a 1 -dimensional (1-D) density field p{x) over a range 0 < x < L. If is nof 
difficulf fo exfend fhe 1-D analysis fo higher dimensions. If is convenienf fo use fhe densify confrasf 
defined byQ 

p{x) - p 


e{x) = 


( 2 . 1 ) 


where p is fhe mean densify in fhis field. Acfually, observed dafa and simulafed samples can only 
provide densify disfribufions wifh finife resolufion, say Ax. Hence, wifhouf loss of information, 
e(x) can be expressed as a histogram wifh 2^ bins (Figure 1), where J is faken large enough so fhaf 
L/2'^ < Ax i.e. 

J < mod(| In Ax|/In2) + 1, (2.2) 


The hisfogram is labeled so fhaf fhe 2"^ bins are designated by an infeger, I, running from 
0 < / < 2*^ - 1. Bin ( covers a range from L12 fo L{1 + 1)2 ^. The samples can fully be 
described by fhe 2'^ ej^i (0 < Z < 2*^ — 1) defined by 


eji = e(x), LZ2“'^ < x < L{1 + 1)2"‘^. (2.3) 


Using ej^i, we can rewrite e(x) as 


2-^-1 


:(x) = e-^(x) = e,pi(l)fi{x) 
1=0 


(2.4) 


‘‘We do not as usual denote the density contrast by 5 , because of possible confusion with the Kronecker <5 symbol 
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(2.5) 


where the (/)n{x) are given by 



1 for < X < L(/+ 1)2“'^ 

0 otherwise. 


Actually, (t>^i{x) is a top-hat window function on resolution scale Ljl'^ and at position L12' < 

X < L(/ + 1)2-'^. 

Expression (2.5) can be generalized to top-hat window functions on different scales. We first 
define a fop-haf scaling funclion as 




1 for 0 < ry < 1 
0 ofherwise. 


( 2 . 6 ) 


Thus, one can consfrucf a sef of fop-haf window funcfions by a franslafion and dilation of fhe scaling 
function (2.6) as [| 

</.f;(x) = </.^(2%/L-/), (2.7) 

where j, I are integers, and j > 0, 0 < ( < 2-^ — 1. Obviously, (j)^i{x) is a normalized fop-haf 
window function on scale L/2-^ and af the position L12' ^ < X < L{1 -|- 1)2 A 4>^i{x) is called the 
mother function. 

If we smooth the density contrast e(x) by the mother function (t>^i{x) on scale j = J — 1, we 
have an approximate expression of e(x) as 

(-'^~^{x) = ej_i,i(/>:^_i/x) (2.8) 

1=0 

where the mother function coefficients (MFCs) are given by 

ej-i,z = + ej,2/+i)- (2-9) 

Similarly, we can continue this procedure to find smoofhed disfribufions e-^ (x) on scales j = J — 
2, J — 3,... as 

2t-l-l 

= Y ( 2 - 10 ) 

1=0 

where fhe j-fh MFCs can be found from {j + l)-lh MFCs by 

= 2(^i+h2i + ej+i,2i+i)- (2.11) 

These resulfs are nofhing new. In facl fhey are window smoofhing on a scale-by-scale basis as 
shown in Figure 1. The ej^i confain less information fhan In order nof fo lose any informafion 

as a resulf of the smoothing, we should calculate the difference between the smoothed distributions 
on succeeding scales, say e'^'''^(x) — e-^ (x). Figure 1 also plots these differences. 

^Actually, eq.(2.7) is not a dilation of eq.(2.6), but a compression. We use the word ’’dilation”, because in the wavelet 
literature the factor 1/2“^ is called the scale dilation parameter, regardless if it is larger than 1. 
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Figure 1: A Haar wavelet multiresolution decomposition. The original sample is shown by the top 
left figure. Its resolution is j = 4. The left column is the reconstructed distributions of the MFCs 
ej^i on scales j = 3, 2, 1 ■ The right column of histograms show the distribution of FFCs, ej^i on the 
corresponding scale j. 


To describe this difference, we define a wavelef as 

1 forO<? 7 <l /2 

-1 for 1/2 < 7? <1 (2.12) 

0 ofherwise. 

This is fhe Haar wavelef, and if is fhe reason thaf we used a superscripf H on fhe funcfions cj){x) and 
■(/(x). As wifh fhe mofher funcfions, one can consfrucf a sef of by dilating and franslafing 

eq.(2.10) as 

'ip^^iix) =ip^{2^x/L-l) 

1 for Ll2-i <x<L{l + 1/2) 2-^ 

-1 for L{1 + 1/2)2-^' <x<L{l + l) 2-^ ^ ‘ ’ 

0 ofherwise. 

is called fhe falher function (Meyer 1993) [| 

®In some of the literature ipf^i (x) is called the mother function, while (pf i (x) the father function. This is unfortunate 
hut we hope this confusion will not lead to too many misunderstandings. 
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From eqs.(2.7) and (2.13), we have 


Eq.(2.4) ean be rewritten as 

c-'ix) = ej-u'Pj-iAx) + Ij-u’Pj-i/x) 

where we used eqs.(2.8) and (2.9), and 

ej-i,fc = 2(^42/ — ej,2«+i)- 


(2.14) 


(2.15) 


(2.16) 


The ej^k are ealled father funetion eoeffieients (FFC). Thus, the differenee between ej^ and ^{x) 
is given by an FFC term as 

2''-l-l 


e-^ \x) = ^ ej-iAi^f_^i{x). 

1=0 


(2.17) 


Eq.(2.15) beeomes then 

(■^(x) = €'^-\x) + e'^-\x) 


(2.18) 


The term ^(x) eontains all information lost during the window smoothing from order J to J — 1. 
While not immediately obvious, these two funetions, the mother funetions and the father funetions, 
together form a eompaetly supported orthogonal bases. 


2.2 Multiresolution analysis 


Eq.(2.18) shows that the J-th distribution ean be resolved into a (J — l)-th smoothed distribution 
and a term given by (J — l)-th EECs. We ean repeat this proeedure, i.e. resolving the (J — l)-th 
distribution into (J — 2)-th smoothed distribution and a term eontaining the (J — 2)-th EECs, and 
so on. The original distribution e(x) ean then be resolved seale by seale as 


€'^(x) = ^{x) + e^ ^(x) 

= e'^~‘^{x) + e^~‘^{x) + e^~^{x) 

= e°(x) + e^(x) + ... + e'^“^(x) + ^"^“^(x) 


where 


and 


2-’-l 


e\x) = e°(x) + e°(x) + ... + ^(x) = ^ ej,i<Pf,iix) 

1=0 




1=0 
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( 2 . 20 ) 


( 2 . 21 ) 



Eq.(2.19) is a wavelet multiseale deeomposition (or wavelet multiresolution analysis) of e(a:). 
As emphasized before, unlike the window funetion deeomposition (e-^ (x)), the wavelet multiseale 
deeomposition (e-^ (x) and e^{x)) does not lose information. In other words, one eannot reeonstruet 
e{x) from windowed eomponents e^{x), (J = 0,1,... J — 1), but we are able to reeonstruet the 
original distribution from the “differenee” funetions e^{x), {j = 0,1,... J — 1), and e°(x). Here 
e°(x) = eo,o<?i>^o(*)’ ^o,o is simply the mean density of the distribution e(x) in the range [0, L]. 
Using (2.1), we have e°(x) = 0. 

Moreover, the father funetions V'j^;(x) are orthogonal with respeet to both indexes j and I, i.e. 

^ 'ipfj>ix)'iljf^iix)dx=(^^^Sfj6i:^i ( 2 . 22 ) 

where dj/j is the Kroneeker delta. For a given j, '4’fi{x) are also orthogonal to mother funetions 
(l)f,^i{x) with / < j, i.e. 

^ 4>f,v{x)'4)f^i{x)dx = d, if /</ (2.23) 

The FFCs in eq.(2.21) ean be found by 


The last line of Eq.(2.19) is then 

e(x) = e'^(x) 



= e°(x) + e°(x) + ... + ^(x) + e‘^ ^(x) 


Z^jr=0 2 ^ 1=0 




(2.24) 


(2.25) 


The FFCs provide a eomplete representation of e(x) whieh we will eall the Haar representation. 


2.3 Compactly supported orthogonal bases 

The Haar representation suffers from the drawbaek that the /j^/x) are not loealized in Fourier spaee. 
As was mentioned in §2, an adequate spaee-seale deeomposition should be loealized in both physieal 
and seale (Fourier) spaee. The top-hat window funetion (2.6) and the wavelet (2.10) eannot meet 
this eondition, beeause they are diseontinuous. In the mid-80’s to early 90’s a great deal of work was 
done in trying to find a eontinuous bases that was well loealized in Fourier spaee (Daubeehies 1988, 
Meyer 1988, Mallat 1989, Mallat & Zhong 1990,). Speeifieally, Daubeehies (1988) eonstrueted 
several families of wavelets and sealing funetions whieh are orthogonal, have eompaet support and 
are eontinuous. 

In order to eonstruet a eompaetly supported diserete wavelets basis the following two reeursive 
equations were involved (Daubeehies 1992, Meyer 1993). 

= El ai(l)(,2r] - 1) 
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It is easy to show that the Haar scaling (2.6) and wavelet (2.12) satisfy eq.(2.26) only if the coeffi¬ 
cients ao = ai = bo = —bi = 1 are. nonvanishing. 

Directly integrating the first equation in (2.26) it follows that 


J2ai = 2. 

i 

Requiring orthonormality for (j){x) with respect to discrete integer translations, i.e. 

roo 

/ 4>{r] - m)4>{r])dr] = 5mfi, 


(2.27) 


(2.28) 


we have that 

^ aiai + 2m = (2.29) 

i 

The wavelet ipif]) has to qualify as a “difference” function, i.e. it is admissible. We have then 


r-l-co 


'ip{rj)dr] = 0, 


(2.30) 


so we need 


The multiresolution analysis requires that 


T.i>‘ = o 





l)drj = 0. 


(2.31) 


(2.32) 


So one has 

h = (-l)'ai_z. (2.33) 

After the Haar wavelet, the simplest solution of the recursive equations (2.26) with conditions (2.27), 
(2.29) is 

ao = (1 + v/3)/4, ai = (3 + \/3)/4, aa = (3 - \/3)/4, ag = (1 - \/3)/4. (2.34) 


This is called the Daubechies 4 wavelet (D4) and is plotted in Figure 2. In general, the more coeffi¬ 
cients ai that are nonvanishing, the wider the compact support is, while the wavelet itself becomes 
smoother (Daubechies 1992; Chui 1992; Kaiser, 1994) 


2.4 DWT decomposition 


To conduct a DWT analysis, one first constructs the DWT bases 
and Tp{x) as 





1/2 

(p{2^x/L - 


by dilation and translation of (/)(x) 
1) (2.35) 
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Figure 2: The Daubeehies 4 wavelet, ipiv)’ determined by (2.34). The wavelet is plotted for presen¬ 
tation purposes only, so the units are arbitrary. 


and 

j V'(2^x/L-/), (2.36) 

where and with integer j and I are the father funetions and mother funetions, respeetively. 
Different from eqs.(2.7) and (2.13), eqs.(2.35) and (2.36) inelude a normalization faetor (2'^7L)^/^. 
The set of V'jv and (j)o,m{x) with 0 < j < oo and — oo < l,m < oo form a eomplete, orthonormal 
basis in the spaee of funetions with period length L. 

To subjeet a finite sample in region L to a DWT expansion, we introduee an auxiliary density 
distribution p{x) whieh is an L periodie funetion defined on spaee — oo < x < oo. Using fhe 
eomplefe, orfhonormal basis, V’yz and (pj.mix), the density distribution p{x) ean be expanded as 

OO OO OO 

pA) = ^0 (po (x) + E E cyzV'iXx) (2.37) 

m=—oo j=0l=—oo 


where the eoeffieients co,m and cy; are ealeulated by the inner produets 


Co,m — 


p{x)(j)o,m{x)dx, 


and 


= 


p{x)'ipj^i{x)dx. 


Beeause all bases funetions '4)j^i{x)dx = 0 [eq.(2.30)], eq.(2.37) ean be rewritten as 


loo 

oo 


PA) = E Co,m</'0,m(x) +p E E ^j,i'4’j,i{x) 

m=—oo j=0l=—oo 


(2.38) 

(2.39) 


(2.40) 


11 




where p is the mean density, and 


^hi = 


e{x)^j^i{x)dx 


(2.41) 


where e(x) = {p{x) — p)/pis the density eontrast as eq.(2.1). 

By definition, p{x) = p{x + mL) for integers m, eq.(2.38) ean be rewritten as 


/ OO 

p{x + mL)(j)o^rnix)dx. 

-OO 


(2.42) 


Using eq.(2.35), we have then 


/ OO 

p{x + mL)L~^^'^(j){x/L — m)dx 

-OO 

/ OO 

p{x')L~^/‘^(l){x' / L)dx' 

-OO 

/ OO 

p{x')(l)Qfl{x')dx' = co,o 

-OO 

where x' = x + mL. The eoefficients co,m are independent of m. Using the property of 
of unity” of the sealing function (Daubechies 1992) 


^ 4>{ri + m) = l, 


(2.43) 
‘‘partition 

(2.44) 


from eq.(2.43) one has 


C0,0 = 


p{x)L ^^‘^(j){x/L)dx 

' —OO 
OO i 

= ^ / p{x + mL)L~^'^(l){x/L + m)dx 

-oo4o 

fL “ 

= / p{x)y^ (j){x/L + m)dx 

40 2 ^ 

rL 

= / p{x)dx. 

20 

The first term in the expansion (2.40) becomes 

OO OO 

X! Co,m0o,m(a;) = ^ 4>(x/L + m) = L~^ J^ p{x)da 


(2.45) 


(2.46) 


m=—oo 


m=—oo 


This term is the mean density p. From eq.(2.40) we have finally 

p{x) - p 


e(x) = 


P 


= Y.Y. 


X 


(2.47) 


j=0 l=—oo 
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where the father funetion eoeffieients are given by eq.(2.41). Eq.(2.47) has the same terms as 
eq.(2.25), and is a multiresolution analysis with respeet to the orthogonal bases Tpj^i{x). 

While the multiresolution analysis eq.(2.19) still holds, eqs.(2.20) and (2.21) should be replaeed 


by 


and 


2J-1 


'(x) = €°(x) + e^{x)... + ^(x) = ^ ej^i4>j^i{x) 


1=0 


2t-l 


1=0 


(2.48) 


(2.49) 


2.5 Relationship between Fourier and DWT expansion 

In terms of the Fourier transform, e(x) is expressed as 


n=—oo 


i^'KnxjL 


with the eoeffieients eomputed by 


= i e(x)e-*2™^/^dx. 


L 


(2.50) 


(2.51) 


'0 


Since both the DWT and Fourier bases sets are complete, a function may be represented by ei¬ 
ther bases and there is thus a relationship between the FFCs and the Fourier coefficients. Substituting 
expansion (2.50) into eq.(2.41), yields 


OO n (y^ 'DO 

enipjA-^) 

n=—oo n=—oo 

where 'ipj^i{n) is the Fourier transform of il)j^i{x), i.e. 

/ OO 

-OO 

Using eq.(2.36), eq.(2.52) gives 

°° / oi \ roo 

^ (t) ^”7 e*2Wi^(2ia;/L - Odx . 


(2.52) 


^3,1 = 


(2.53) 


(2.54) 


Defining variable rj = 2^x/L — I, one finds 


2^ 


- 1/2 


,.= E lx 


i27Tnll2^ I i27rnr)l2^ 

Cri6 / C 


'43{ri)drj 


or 


3,1 = J2 


OO / 2 j\ 1/2 


L 




(2.55) 


(2.56) 
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where ^/;(n) is the Fourier transform of the wavelet ipif]) 

/ OO 

(2.57) 

-OO 

Eq.(2.56) is the expression of the FFCs in terms of the Fourier eoefficients. 

From expansions (2.51) and (2.49), one can also express the Fourier coefficient, in terms of 
FFCs as (see Appendix A) 

^ OO 2t-l 

n^O (2.58) 

^ j=0 1=0 

or 

OO 2^-1 ... s 1/2 

E" = EEUj7 (2.59) 

j=0 1=0 ^ ^ 

Eq.(2.58) and (2.59) show how the Eourier coefficients are determined by a DWT analysis. 

2.6 Comparison with the Fourier transform 

The difference between the Eourier transform and the DWT can easily be seen in phase space {x,k). 
According to the uncertainty principle, each mode of a complete, orthogonal bases set corresponds to 
a ’’element” with size Ax and Ak, and area AxAk ~ 27r in phase space. Eor the Eourier transform, 
the ’’elements” are taken to be Ak = 0 and Ax = oo, while for the DWT both Ak and Ax are finite, 
and AxAk ~ 27r (Eigure 3). 

The representation by the Eourier bases, i.e., the trigonometric functions, is delocalized (Ax = 
(X)). The DWT resolves the distribution e(x) simultaneously in terms of its standard variable (say 
space) and its conjugate counterpart in Eourier space (wavenumber or scale) up to the limit of un¬ 
certainty principle. 

This doesn’t mean that the Eourier transform loses information about e(x) but rather that the 
information on the position is spread out. Eor instance, let us consider a distribution that contains 
a few clumps of scale d. The positions of the clumps are related to the phases of all the Eourier 
coefficients k < ‘I'k/ d. There is no way to find fhe posifions of fhe clumps from a finite number of 
Eourier coefficienls. The only solufion would be fo reconsfrucf e(x) from all fhe Eourier coefficienls. 
If some of fhe ’’clumps” are due fo experimenfal errors, we will nol be able fo filter fhem ouf because 
fhey have affecled all fhe Eourier coefficienls. 

On fhe olher hand, fhe DWT keeps fhe locality presenf in fhe disfribufion and allows for fhe 
local reconsfrucfion of a disfribufion. If is Ihen possible fo reconsfrucf only a porfion of if. There 
is a relafionship befween fhe local behavior of a disfribufion and fhe local behavior of ifs EECs. 
Eor insfance, if a disfribufion e(x) is locally smoofh, fhe corresponding EECs will remain small, 
and if e(x) confains a clump, fhen in ifs vicinify fhe EECs amplifude will increase drasfically. To 
reconsfrucf a porfion of fhe disfribufion, if is only necessary fo consider fhe EECs belonging fo fhe 
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Figure 3: “Element” of different transforms in phase spaee (x, k). A. DWT, both AA: and Ax are 
finite, and AxAk ~ 2tt. B. Fourier transform, the ’’elements” are Ak = 0 and Ax = oo. C. Gabor 
transform, here Ak = Ax = eonstant, regardless the seale. 


eorresponding subdomain of the wavelet spaee {j, /). If the FFCs are oeeasionally subjeet to errors, 
this will only affeet the reeonstrueted distribution loeally near the flawed positions. Furthermore, 
the Fourier transform is also partieularly sensitive to phase errors due to the alternating eharaeter of 
the trigonometrie series. This is not the ease for the DWT. 

Early approaehes to finding information on the loeations in the Fourier transform sehemes 
were given by the Wigner funetion in quantum meehanies and the Gabor transform. The differenee 
between the Gabor transform and the DWT ean also seen in Figure 3. The orthogonal basis of the 
DWT are obtained by (spaee) translation and (seale) dilation of one wavelet. The DWT transform 
gives very good spatial resolution on small seales, and very good seale resolution on large seales. 
Gabor’s windowed Fourier transform is based on a family of trigonometrie funetions exhibiting 
inereasingly many oseillations in a window of eonstant size. In this ease the spatial resolution on 
small seales and the range on large seales are limited by the size of the window. 

The relation between the diserete and the eontinuous Fourier transforms when e(x) is viewed 
as a eontinuous distribution sampled on an interval A is e(n) = Ae^ where e(n) is the usual eon¬ 
tinuous Fourier transform and in is the diserete Fourier transform. However, no sueh relationship 
exists between the diserete wavelet transform, sueh as the D4 wavelet, and the eontinuous wavelet 
transform (CWT). The D4 wavelet has no eontinuous analog. Eaeh type of eomplete, orthogonal 
wavelet bases must be eonstrueted from serateh. In general, CWTs form an over eomplete bases, 
i.e. they are highly redundant. As a eonsequenee, CWT eoeffieients of a random sample show A 
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correlation that is not in the sample itself but is given by the over complete bases. Since the DWT 
is complete and orthogonal, this is not a problem in using the DWT. The point is that the difference 
between the DWT and the CWT is essential, and by construction, these two bases are very different. 

The DWT is not intended to replace the Fourier transform, which remains very appropriate in 
the study of all topics where there is no need for local information. 

3 A DWT Estimation of the Probability Distribution Function 

3.1 One-point distribution of FFCs 

As has been emphasized in §1, the statistical features of a non-linearly evolved density field can 
not be completely described by the power spectrum or the two-point correlation function. For a 
complete description of salient statistical features, the probability distribution functions (PDF) of 
the density field are required. 

There is, however, a very real problem in determining the PDF due mainly to the central limit 
theorem of random fields (Adler 1981). If fhe universe consists of a large number of randomly 
distributed clumps with a non-Gaussian one-point function, eq.(2.51) shows that for large L the 
Fourier amplitudes, e^, are given by a superposition of a large number of non-Gaussian clumps. 
According to the central limit theorem, the distribution of will be Gaussian when the total number 
of clumps is large. Thus, in general, the statistical features of the clumps can not be seen from 
the one-point distribution of the Fourier modes, e^, even if the PDF function of clumps is highly 
non-Gaussian. If the clumps are not distributed independently, but are correlated, the central limit 
theorem still holds if the two-point correlation function of the clumps approaches zero sufficiently 
fast (Fan & Bardeen 1995). 

On the other hand, the father functions, V'j ;(x), are localized. If the scale of the clump is d, 
eq.(2.41) shows that the FFC, ;, with j = \og 2 {L/d), is determined only by the density field 
in a range containing no more than one clump. That is, for scale j the FFCs are not given by a 
superposition of a large number of the clumps, but determined by at most one of them. Thus, the 
one point distribution of the FFC, avoids the restriction of the central limit theorem, and is able 
to detect the PDF related to the clumps, regardless the total number of the clumps in the sample 
being considered. 

This point can also be shown from the orthonormal bases being used for the expansion of the 
density field. A key condition needed for the central limit theorem to hold is that the modulus of the 
bases be less than C/ vT, where L is the size of the sample and C is a constant (Ivanonv & Leonenko 
1989). Obviously, all Fourier-related orthonormal bases satisfy this condition because the Fourier 
orthonormal bases in 1-dimension are such that (l/\/Z)| sin/cx| < C / \/L and (1/\/L)| cos/cx| < 
C/^/L, and C is independent of coordinates in both physical space x and scale space k. On the 
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other hand, the father funetions (2.36) have 





(3.1) 


because the magnitude of the basic wavelet '4>{x) is of the order 1 [eq.(3.26)]. The condition 
< Cl\fL, will no longer hold for a constant C independent of scale variable j. 


3.2 “Ensemble” of FFCs 

In cosmology, no ensemble of cosmic density fields exists, and at most only one realization, i.e. 
the observed density distribution, is available. In order to have reasonable statistics, the cosmic 
density field is usually assumed fo be ergodic: fhe average over an ensemble is equal fo fhe spafial 
average faken over one realizafion. If is sometimes called fhe “fair sample hypofhesis” in LSS sfudy 
(Peebles 1980). A homogeneous Gaussian field wifh continuous specfrum is cerfainly ergodic (Adler 
1981). In some non-Gaussian cases, such as homogeneous and isofropic furbulence (Vanmarke, 
1983), ergodicify also approximafely holds. Roughly, fhe ergodic hypofhesis is reasonable if spafial 
correlafions are decreasing sufficienfly rapidly wifh increasing separafion. The volumes separafed 
wifh disfances larger fhan fhe correlafion lengfh are approximafely sfafisfically independenf. Even 
for one realizafion of e(x), FFCs af differenf locafions I can be freafed as resulfs from sfafisfically 
independenf realizafions. Thus, fhe values of ej^i on differenf I form an ensemble of fhe FFCs on 
scale j. 

This resulf can be more clearly seen from eq.(2.56). For many wavelefs, fheir Fourier frans- 
forms V’(n) are non-zero (localized) in fwo narrow ranges cenfered af n = ±np. For insfance, 
fhe Baffle-Femarie wavelef Up = ±1, and fhe Daubechies 4 wavelef Up ~ ±1.9. The sum over 
n in eq.(2.56) is only faken over fhe fwo ranges {up — 0.5Anp)2-^ < n < (up + 0.5Anp)2^ and 
— (np ± 0.5Anp)2-^ < n < —{up — 0.5Anp)2-^, where Anp is fhe widfh of fhe non-zero ranges of 
'ip{n). Eq.(2.56) can fhen be approximafely rewriffen as 

r X 1/2 {np+0.5Anp)2^ 

n={np—0.bAnp)2^ 
p\l/2 {np+0.5Anp)2i 

\en\cos{9^ + 9n + 27ml/2^) (3.2) 

n={np—0.5Anp)2^ 

where we have used 'ip{—np) = 'il)*{np) and e_n = because both V'(x) and e{x) are real. 9^, 9n 
in eq.(9) are the phases of ip{np) and Cn, respectively. 

As we pointed out in §3.1, is Gaussian even when the clumps are non-Gaussian. For a 
homogeneous random field, fhe phase of e„, i.e. 9n, should be uniformly randomly disfribufed and 
from eq.(3.2), fhe probabilify disfribufion of ij^i is independenf of 1. Thus, each FFC is a realization 
of the I — independent stochastic variable ij^i. The FFCs, ij^i, on scale j form an ensemble with 2^ 




17 



realizations. The statistics with respect to the one-point distribution of FFCs ij^i should be equal to 
the results of the ensemble statistics. The goodness of this estimation can be measured by the Large 
Number Theorem, that is, the relative error is about Simply stated, when the “fair sample 

hypothesis” holds, the one-point distribution of FFCs from (observed) one-realization will be a fair 
estimate of the PDF of the cosmic density field. 

3.3 Scale mixing 

The PDF is sometimes measured by the count in cell (CIC) method. The CIC detects the one-point 
distribution of the density field in given cubical cells wifh side d or Gaussian spheres wifh radius 
Rg- It is generally believed that the CIC one-point distribution of window d is dominated by the 
density fluctuations on scale ~ 2d or Rq. 

Actually, the CIC analysis is essentially the same as the MFC. The window function of the 
CIC corresponds to the mother function, and the count to the amplitude of the MFCs. The one-point 
distribution given by the CIC have similar properties as the MFC one-point distribution. A problem 
with the MFC one-point distributions is scale mixing. Even though the mother functions of the 
DWT, (j)j^i{x), are localized in spatial space they are not orthogonal with respect to the scale index j, 
i.e. not localized in Fourier space. The MFCs, (l)j^i{x), are dependent on perturbations on all scales 
larger than L/2L Thus, if the clumps are multiscaled, the MFCs will also be Gaussian as required 
by the central limit theorem and the MFC one-point distribution may miss the clumps. Similarly, 
the CIC is scale mixed, and not suitable for studying the scale dependence (or spectrum) of various 
statistical measures. 

There are more problem related to the cubic cell CIC. The cubic cell window is just the Haar 
wavelet or Daubechies 2 (D2) [see eq.(2.5)]. It is very well known that among Daubechies wavelets, 
only D2 is not localized in Fourier space because the Fourier transform of its wavelet (2.12) is 

2 

V^^(n) = —[sin(7rn) — f cos(7rn)] sin^(7rn/2) (3.3) 

7rn 

When n <C 1, {n) ~ — f(7r/2)n. Therefore, eq.(2.56) gives 

V = E fl) „ > » (3.4) 

n<2t \ ^ / 

Eq.(3.4) shows that the large scale (small n) perturbations will significantly contribute to, and even 
dominate the ; if lim^^o ne is larger than a non-zero constant. So even the EEC one-point distri¬ 
butions of the D2 wavelet are not a good way of measuring the PDE. 

3.4 Spectrum of FFC cumulants 

Since EEC one-point distributions effectively allow for scale decomposition, one can easily calculate 
the spectrum of the EEC moments or cumulants on any order as follows. 
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The second order cumulant is given by 


^ 2J-1 
1=0 

where ey; is the average of ey; over 1. a‘j is, in fact, the power spectrum with respect to the modes 
ofDWT (see, §4.1). 

The third and fourth orders are 


^ 2J-1 

^ ~ ^j,l) ’ 

1=0 

(3.6) 

2t-l 

1=0 

(3.7) 


^ 2^ 

and will measure the spectra of skewness and kurtosis, respectively (see, §5.1). 
Generally, the spectrum of n-th order moment is defined as 




2 ^ 


2^-1 

/=0 


(3.8) 


The definitions (3.5) - (3.8) show that the numerical work of calculating higher order moments 
(cumulant) is not any more difficult than calculating the second order moments. Generally, the 
calculation of third and higher order correlations of large scale structure samples is very strenuous 
work. But the numerical work involved in calculating the DWT is not more difficult than the FFT, 
and can be faster. The FFT requires ~ N Log N calculations, while the DWT, using a “pyramid” 
scheme, requires only order N calculations (Press et al. 1992). 


4 Local power spectrum 

4.1 Power Spectrum with respect to DWT basis 

For the Fourier expansion (2.50) and (2.51) Parseval’s theorem is 

CXD 

e{x)\^dx= ^ |enp, (4.1) 

n=—oo 

which shows that the perturbations can be decomposed into domains, n, by the orthonormal Fourier 
basis functions. The power spectrum of perturbations on length scale L/n is then defined as 

P{n) = |enp. (4.2) 

Similarly, fhe Parseval fheorem for fhe expansion (2.49) can be shown fo be (see Appendix B) 




Comparing eqs.(3.3) and (3.1), one can relate the term /L to the power of perturbations 

on length scale L/2-^, and the term \ ej^i\‘^/L to the power of the perturbation on scale L / 2^ at position 
lL/2^. The spectrum with respect to the DWT bases can be defined as 


, 2^-1 


1=0 


The DWT spectrum should be defined as the variance of the FFCs, i.e., 


^ 2^-1 


PJ" = i E (V - Hif' 


1=0 


(4.4) 


(4.5) 


Because the mean of the FFCs, ij^i, over I is zero [eq.(2.56)], Pj should be equal to Comparing 
with eq.(3.5), we have cj| = (L/2'^ )PJ“'’ . 


4.2 Relationship between Fourier and DWT Spectra 


The relationship between the spectra of the Fourier expansion (4.2) and the DWT (4.4) or (4.5) can 
be found from eq.(3.2), which shows that the FFCs on scale j are mainly determined by the Fourier 
components e^, with n centered at 

n = ±np2\ (4.6) 

where |np| are the positions of the peaks of ^/)(n). Thus, from eqs.(4.4), (4.5) and (4.6), we have 


P{n)j ~ 


1 


—2 TDvar 


^ 2^+1 An^ 






where P{n)j is the average of Fourier spectrum on the scale j given by 

.. (np+0.5Anp)2.’ 

P{n)j = P{n). 

2^ An„ ^, 

^ n={np—0.6Anp)2^ 


(4.7) 


(4.8) 


Eqs. (4.7) and (4.8) provide the basic way of detecting the Fourier power spectrum by a DWT 
analysis. 

From eqs. (4.8) and (4.6), one has 


log P{k)j = log Pj - (log 2)j + A, (4.9) 

and 

log A: = (log2)j — logL/27r + P. (4.10) 

The factors A and B are given by 

A = -log(2Anp|'0(np)p) (4.11) 
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and 

B = log Up. (4.12) 

The constants A and B depend on the basic wavelet being used in the DWT analysis. In the 
case of the D4 wavelet, A = 0.602, and B = 0.270. 

Eqs.(4.9) - (4.12) provide the way of directly transferring a DWT spectrum Pj or into the 
corresponding Fourier spectrum P{k)j and vice versa. 

4.3 Finite size and complex geometry of samples 

It is well known that a difficulty in spectrum detection comes from the finite size of the sample which 
leads to uncertainty in the mean density of the objects being considered. The classical spectrum 
estimator, i.e., the Fourier transform of the two-point correlation function depends essentially on 
the mean density p. A two-point correlation analysis cannot detect any correlations with amplitude 
comparable to the uncertainty of the mean density. If we determine the spectrum via the two-point 
correlation function, the uncertainty in p leads to uncertainties on all scales in which the correlation 
amplitude is comparable to the uncertainty in the mean density. The problem of the uncertainty in 
the mean density is more severe for the study of high redshift objects because the mean density of 
these objects generally is redshift-dependent. 

The CIC or Fourier transform on a finite domain have been used to avoid this difficulty. The 
CIC detects the variance of density fluctuations in windows of a cubical cell with side I or Gaussian 
sphere with radius Rg- The behavior of the perturbations on scales larger than the size of a sample 
is assumed not to play an important role. This reduces the uncertainties caused by a poor knowledge 
of long wavelength perturbations and by the finite size of the observational samples. It is believed 
that the variance in cell I is mainly dominated by the perturbation on scale ~ 21 or Rq. Therefore, 
the variances are considered to be a measure of the power spectrum on scale I (Ffstathiou et al 
1990, Kaiser & Peacock 1991). Additionally, the Fourier transform on a finite domain (the Gabor 
transform, for instance) can also avoid the difficulty of the infinity of x. 

As was mentioned in §3.3, the problem with the CIC statistic is that the variances obtained 
from the decomposition of cells with different size d are not independent. While it is still possible 
to reconstruct the power spectrum by the scale mixing leads to uncertainty. The scale mixing 

becomes a serious problem when the power law spectrum has a negative index. Moreover, the 
resulting errors are not easy to interpret because the errors for different d are also not independent. 

The DWT provides a method to solve this problem. Actually, this problem is the same as trying 
to detect a local (finite range) spectrum. One of the motivations of developing DWT was to measure 
local spectra (Yamada and Ohkitani 1991, Farge 1992). Because the FFCs are localized, one can 
calculate local power spectrum using the FFCs related to the finite range being considered. The 
FFCs are determined by the difference of the MFCs in a localized neighborhood, i.e., by measuring 
the differences between the local mean densities. As a consequence, the mean density over length 
scales larger than the sample’s size is not necessary in calculating the FFCs on scales equal to or less 
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than the size of data. The influenee of the uneertainty of the mean density is signifieantly redueed 
by the FFC speetrum deteetion. 

To avoid the diffieulty of finite sized samples, speeifie boundary eonditions are sometimes se- 
leeted. For instanee, N-body simulations always assume a periodie eondition. Yet, the ehoiee of 
boundary eonditions may affeet the speetrum deteetion. Again, father funetions com- 

paetly supported, and the FFCs that are uneertain due to the boundary eonditions are only the eoef- 
fieients at the boundary, V’yzi and 'ipj,i 2 ^ where li and I 2 are the positions of the 1-D boundary. The 
uneertainty due to the ehoiee of boundary eonditions will also be suppressed by the wavelet SSD. 
We illustrate this point by using different boundary eonditions to find the speetrum. Figure 4 shows 
the loeal speetrum of sample in a finite area L with 512 bins, with boundary eonditions A) periodie 
outside this range; B) zero outside this range. Exeluding a small effeet on the largest seale (i.e. the 




Figure 4: Density distributions generated from speetrum P{k) = /c/(l + 10^/c^)ina range of 
L = 512 bins, where k = 27rn/L, and n is integer. The boundary eonditions outside the 512 bin 
area are taken to be A) periodie, B) zero. C.) and D.) are the speetrum reeonstruetion for A.) and B.) 
respeetively. 


More preeisely, the effeet of boundary eonditions on the FFCs should be deseribed by the so- 
ealled “influenee eone” whieh eonsisfs of fhe spatial supporf of all dilafed falher funelions. For 
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instance, if 'ipji{x) is well-localized in the space interval Ax for j = 0, the influence cone centered 
at xo will be defined by x G [xq — (Ax/2'^+^), xq -h (Ax/2-^+^)]. Namely, the FFCs corresponding 
to the positions with distance larger than (Ax/2'^'''^) from the boundary will not be affected by the 
selection of boundary conditions. 

This result is also useful in solving the problem of the complex geometry of observational data. 
For instance, samples of Lya forests in QSO absorption spectra cover different spatial (redshift) 
ranges for different QSO’s. Any statistic of a sample compiled from many QSO spectra needs, 
at the very least, a complicated weighting scheme. Moreover, the number density of Lya forest 
lines depends on redshift, and therefore, it is very difficult to find a proper weighting scheme for 
methods which depends essentially on a good measure of the mean density of the sample. Despite 
the fact that samples of the Lya clouds are relatively uniform among the various samples of high- 
redshift objects, and Lya forest lines are numerous enough to provide statistical analysis, the power 
spectrum and higher order statistics of Lya forests have not been systematically calculated because 
of the difficulty in trying to compensate for the geometry of the samples. 

Since FFCs in the range being considered are not strongly affected by the data outside the 
range, one can freely extend a sample in spatial range {Di, D 2 ) to a larger range {Dmim Dmax) 
(Dmin < Di,Dmax > D2) by adding zero to the data in ranges {Dmin,Di) and {D2,Dmax)- 
One can then take statistics by simply dropping all FFCs related to the positions in the regions of 
{Dmim Di) and {D 2 , Dmax)- Using this technique, all samples can be extended to a desired range 
{Dmim Dmax), and the geometrically complicated samples are regularized. 

Local spectrum can also be detected by the Karhuen-Loeve (K-L) transform, or proper orthog¬ 
onal decomposition, which is designed for an optimization over the set of orthogonal transformation 
of a covariance matrix. However, finding the K-L eigenvectors of a matrix of order / has com¬ 
puting complexity 0{f^). In addition, the K-L bases are not admissible. Even after finding the 
eigenvectors of a data set, updating the bases with some extra samples will cost an additional 0{f^) 
operations. On the other hand, the DWT can quasi-diagonalize the covariance matrix, and there¬ 
fore, K-L transform can be approximately represented by wavelets which leads to less computing 
complexity (Wickerhauser 1994, Carruthers 1995). 

5 Measures of non-Gaussianity 

5.1 Spectra of skewness and kurtosis 

For a density (contrast) distribution e(x) in a range L = N bins, the non-Gaussianity is usually 
measured by the skewness S and kurtosis K defined as 

1 ^ _ 
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and 


1 _ 

^ “ e(xi))^] - 3. (5.2) 

i=l 

These measures cannot describe any possible scale-dependence of the skewness and kurtosis. 

Using the DWT, one can measure the non-Gaussianity by the spectrum of skewness defined as 
[see eqs.(3.6)] 


O'J - o - 


^ rr^ N “23 rr^ 


Nr 2^-1 

E E [(g,z - W, 


1) ]s) 


and the spectrum of kurtosis as [eq.(3.7)] 


(5.3) 


Cf 


K = — 


. Nr 2^-1 


(5.4) 


where the variance cj| is given by (3.5), and is the number of samples. Unlike eqs.(5.1) and (5.2), 
the spectrum description eq.(5.3) and (5.4) can detect not only the non-Gaussianity of the density 
field, but also the scale of objects contributing to the non-Gaussian behavior. 

Note that eqs.(5.3) and (5.4) differ slightly from usual definition of the skewness or kurtosis by 
the sum over s from 1 to This is because at small j one sample on the range L will yield only 
a small number of ij^i, makeing the calculation of Kj meaningless. For instance, for j = 2, each 
sample gives only two FFCs, i.e. e 2 ,o and 62 , 1 . In this case, ij^i = (e 2 ,o + e 2 ,i)/ 2 , and Kj = —2 if 
Nr = 1, regardless whether the sample is Gaussian or what wavelet function is used. The Kj for 
each sample can not be calculated separately and still have meaningful results at lower j. Generally, 
we have many samples covering the range L. One can compile subsets consisting of Nr samples. 
The number of ey; will then be Nr times larger than one sample making the statistics at small j 
viable. As with the usual definitions of skewness and kurtosis, Sj and Kj should vanish for a 
Gaussian distribution. 


5.2 Distribution of clumps 

The effectiveness in detecting the scales of non-Gaussian objects can be illustrated by clump and 
valley structures. Let us consider non-Gaussian density fields consisting of clumps randomly dis¬ 
tributed in a white noise background. Clump distributions are often used to test methods of detecting 
non-Gaussianity in large scale structure study (Perivolaropoulos 1994, Fan & Bardeen 1995). It has 
been shown that one cannot detect the non-Gaussianity of samples by the one-point probabilities of 
the individual Fourier modes, even when the samples contain only a few independent clumps (Kaiser 
& Peacock 1991). 

To begin, first note that a clump or valley with density perturbation Apc on length scale d at 
position I can be described as 

< X < (/ + l)L/2'^- 

^ 1 0 otherwise 
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where Jc = log 2 (L/d), and the positive and negative signs are for a elump and a valley, respectively. 
If a density field p{x) consist of N randomly distributed clumps and valleys of scale d, so that the 
number density is N/2 '^on average, the field can be realized by a random variable 5p with a 
probability distribution P[5p < X] defined as 


P[x < X] 


0 if X < —Apc 

N/2-^-+^ if - Apc < X < 0 

1 - X/2'^- if 0 < X < Apc 
1 if X > Apc 


(5.6) 


The distribution function 5p of clumps and valleys, fd^p) can then be written approximately as 


The (5(..) on the right hand side of eq.(5.7) denote Dirac (5-functions. The characteristic function of 
the random variable 5pof clumps and valleys is 

roo 2'^= — X X 

(pciu) = j /c(x)e*^“(ix = —-h ^ cos(Actt) (5.8) 


where Ac = Apc/ P- It is very well known that the “standard” measures of skewness and kurtosis, 
i.e. eqs.(5.1) and (5.2), for the distribution (5.6) can be calculated from the characteristic function 
(5.8). The results are 
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d^fpcju) 
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= 0 


(5.9) 


and 


where 


K = 
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duf^ 
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X(Ac 


2Jc 


(5.10) 

(5.11) 


is the variance of the distribution. 

Consider density fields consisting of clumps or valleys randomly distributed in a background. 
In this case, the characteristic function is (/>(n) = (j)c{u)<l)i){u), where (j)b{u) is the characteristic 
function of the background distribution. For a randomly uniform Gaussian background with variance 
a^, the overall variance is 


a 
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iV(Ac)^ 

2Jc 


+ cr 


2 


(5.12) 


and the “standard” kurtosis is 


K = 



( 2^- 

y N{s/ny j 


(5.13) 
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where sjn = Ac/cJb is the signal-to-noise ratio. Eq.(5.13) shows that this distribution beeomes 
Gaussian when s/n is small. 

Samples of elumps and valleys randomly distributed in Gaussian noise baekground were pro¬ 
duced. Figure 5 shows typical fields which contain A) 16, B) 32, and C) 48 clumps and valleys 
randomly distributed in white noise background over the range L = 512 bins. The signal-to-noise 
ratio is s/n = 2.0, and the size of the clumps, d, is randomly distributed from 1 to 5 bins, i.e. the 
average bin width of the clumps is about 3. Figure 5 shows the spectrum of kurtosis of the these 
samples. For comparison, the “standard” kurtosis, K, given by eq.(5.2) is also plotted at the position 
j = 9. 

When distributions are non-Gaussian, a Gaussian variance will no longer be applicable to es¬ 
timate the statistical errors. Instead, the error should be calculated from the confidence level of 
an ensemble of fhe samples. In Figure 5, fhe error bars are fhe 95% confidence for fhe ensemble 
consisfing of 100 realizations. 

Figure 5 shows that the “standard” measure of kurtosis K is generally lower than that given by 
the FFCs, especially when the number of clumps is large. Moreover, the errors in K are much larger 
than that of the FFCs, and iT = 0 is contained in the error bars of K. In other words, K is incapable 
of detecting the non-Gaussianity of these samples. This is expected because the measure (5.2) is 
essentially the same as that of the MFCs. As mentioned in §3, the MFC one-point distributions 
will be Gaussian if the clumps are independent and numerous. On the other hand, the spectrum of 
kurtosis confirms fhe non-Gaussianify of fhe disfribufion, even when fhe number of clumps is as 
large as 48. The specfra also show a peak af j = 6 which corresponds fo fhe mean widfh of fhe 
clumps, ~ 3. 

5.3 Suppression of shot noise 

FSS dafa yield disfribufions sampled wifh a finile number of objecfs. Such sampling leads fo non- 
Gaussian signals. Even if fhe original random field is Gaussian, fhe sampled dafa musf be non- 
Gaussian on scales for which fhe mean number in one bin is small. This is fhe non-Gaussianify of 
shof noise. Any non-Gaussian behavior of fhe densify fields will be confaminafed by fhe shof noise. 

For insfance, in numerical calculafions, a disfribufion e{x) of sampled objecfs is offen binned 
info a hisfogram, and in order fo maximally pick up information from a real dafa sef, fhe bin size 
is faken fo be fhe resolution of fhe coordinate x. However when fhe bin size of fhe hisfogram is 
less fhan fhe mean disfance of neighbor objecfs, fhe value of fhe binned e(x) will fypically be 0 or 1. 
Thus, fhe sample is acfually a d=l clump disfribufion wifh a one poinf disfribufion given by eqs.(5.5) 
and (5.6). If is not a Gaussian disfribufion. Maximally picking up informalion abouf fhe disfribufion 
comes af a cosf of artificially infroducing non-Gaussian behavior. 

If is difficulf fo separafe fhe non-Gaussianifies caused by shof noise and binning wifh fhaf given 
by non-Gaussian sfrucfures. The measure K confains all confribufions fo fhe non-Gaussianify of fhe 
densify field. As a consequence, K will be sampling-dependenf, and nol suifable for confronting 
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Figure 5: Kurtosis spectra for samples consisting of 16, 32, and 48 clumps in a length of L = 512 
bins. The sizes of the clumps, d, are randomly distributed in range of 1 to 5 bins. S/N is 2. 

observation with theory, or for discriminating among models. 

However, non-Gaussian shot noise or binning have different spectrum features from that of 
non-Gaussian samples. For shot noise, the distributions on large scales is a superposition of the 
small scale field. According to the central limit theorem the non-Gaussian shot noise will rapidly 
approach zero on larger scales. In other words, if the mean number in a bin is larger for larger bins, 
the one-point distribution of shot noise will become Gaussian on larger scales. The values of \Kj\ 
for shot noise should rapidly approach zero as j gets smaller, i.e. their kurtosis spectrum should be 
monotonously decreasing with decreasing j. 

To illustrate this point. Figure 6 plots the spectrum of kurtosis for a random white noise sample 
given by 

Xi = xi + {x2 — xi) ■ RAN (5.14) 

where Xi is the position of i-th object, and RAN is random number in (0, 1). We take the size of 
the sample (x 2 — x\) = 64 bins, and the total number of objects is 120. Figure 6 shows that the 
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Figure 6: Kurtosis spectrum of 120 objects randomly distributed in 64 bins. 


non-Gaussianity of the random sample is significant only when the mean number of objects per bin 
is less than 2. 

The non-Gaussianity of the shot noise will significantly be suppressed in the spectrum of FFC 
cumulants with increasing scale. This feature is useful in distinguishing between non-Gaussian 
clumps and shot noise. For instance, one can definitely conclude the existence of non-Gaussian 
structures if the kurtosis spectrum is flat, or contains peaks as in Figure 5. 

6 A preliminary result using DWT: Lyct forests 

6.1 LSS Problems of Lyct clouds 

Lya absorption line forests in QSO spectra come from intervening absorbers, or clouds, with neutral 
hydrogen column densities ranging from about 10^^ to 10^^ cm“^ at high redshifts. Since the size 
of the Lya clouds at high redshift is as large as 100 - 200 h“^ Kpc, and their velocity dispersion is 
as low as ~ 100 km s“^ (Bechtold et al. 1994, Dinshaw et al. 1995, Fang et al. 1996), it is generally 
believed that the Lya clouds are probably neither virialized nor completely gravity-confined, buf 
given by pre-collapsed areas in fhe densify field. If is reasonable fo assume fhaf fhe Lya clouds 
joined fhe clusfering process in fhe universe. 

However, almosf all of fhe resulfs drawn from fwo-poinf correlation funclion analysis of Lya 
foresl lines have failed fo defecf any significanf clusfering on fhe scales less fhan 10 h~^ Mpc (Wey- 
mann 1993), where h is fhe Hubble consfanf in fhe unif of 100 km s“^ Mpc“^. In ofher words, fhe 
power specfrum of fhe Lya clouds is flal, i.e. similar fo whife noise. 
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Theoretically, it is hard to believe that the distribution of the Lya lines is only white noise since 
the Lya clouds should be formed via the same process as that for other objects, i.e., gravitational 
clustering. In fact, contrary to the results of two point correlation function, statistics based on other 
methods show definite deviations of the Lya forests from a uniform random distribution. For in¬ 
stance, the distribution of nearest neighbor Lya line intervals was found to be significantly different 
from a Poisson distribution (Liu and Jones 1990). Based on the Kolmogorov-Smirnoff (K-S) statis¬ 
tic, Lya absorbers have been shown to deviate from a uniform distribution at ~ 3a significant level 
(Fang, 1991). 

It is of fundamental importance to understand why the two point correlation function, which 
is one of the most used ways to detect structure, fails to detect any clustering when other methods 
are showing definite structure. Two possible explanations are 1.) the clustering cannot be detected 
by the two-point correlation function on scale less than 10 h~^ Mpc, 2.) the clustering cannot be 
detected by second order statistics. We look at these in more detail. 

1. If the spectrum of Lya clouds is different from white noise only on large scales, the cluster¬ 
ing will be missed by the two point correlation function. As we know, the mean number density of 
the Lya clouds significantly evolves with redshift. The evolution is generally described as 


dN 

dz 



(1+zr 

0 


( 6 . 1 ) 


where {dN/dz)o is the number density extrapolated to zero redshift, and 7 ~ 2 the index of evo¬ 
lution. No mean density is available for calculating the two-point correlation function. Since the 
correlation amplitudes are less than the uncertainty of mean density of Lya lines on scales larger 
than 5 h~^ Mpc, the two-point correlation will overlook structures on large scales. 

2. If the clustering of Lya clouds cannot be detected by second order statistics, the two-point 
correlation function will certainly be incapable of detecting structures. Using a linear simulation 
of density fields, if is found fhaf fhe simulated clouds indeed show no power of fheir fwo-poinf 
correlafion funclions on scales from abouf 100 km s“^ fo 2000 km s“^ (see Figure 7) (Bi, Ge & 
Fang 1995, hereafler BGF). Since fhe perfurbafion specfrum used for fhe simulafion is nol while 
noise, fhe dislribulion of fhe clouds should confain large scale sfruclures. No power in fhe line-line 
correlations may nof mean fhaf fhe dislribulion is while noise, buf inslead indicale, lhal fhe fwo-poinf 
correlation funclion somelimes is ineffeclive in defecting sfruclures on large scales. 

In order lo clarify clusfering of fhe Lya clouds we need lo determine whelher: A.) does fhe 
power specfrum of Lya clouds slay flal on large scales? B.) is fhe clusfering of fhe clouds deleclable 
by higher order slalislics? 


6.2 Flatness of power spectrum 

Question A can be answered by a DWT specfrum defection. We will look al Iwo popular real 
dala sels of fhe Lya foresls. The firsl was compiled by Lu, Wolfe and Turnshek (1991, hereafter 
LWT). The folal sample conlains ~ 950 lines from fhe speclra of 38 QSO lhal exhibif neilher broad 
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Figure 7: Two-point correlation functions of the BGF simulated sample of Lya forests with W > 
0.16A in the LCDM . The error bars represent 1 a. 


absorption lines nor metal line systems. The second set is from Bechtold (1994, hereafter JB), which 
contains a total ~ 2800 lines from 78 QSO’s spectra, in which 34 high redshift QSOs were observed 
at moderate resolution. To eliminate the proximity effect, all lines with z > Zem — 0.15 were deleted 
from our samples. These samples cover a redshift range of 1.7 to 4.1, and a comoving distance range 
from about i9mm=2,300 h~^Mpc to Dmax =3,300 /i“^Mpc, if go = 1/2. 

For comparison, we also work on the simulation samples of BGF. The density fields in this sim¬ 
ulation are generated as Gaussian perturbations with a linear power spectrum given by the standard 
cold dark matter model (SCDM), the cold plus hot dark matter model (CHDM), and the low-density 
flat cold dark matter model (LCDM). The baryonic matter is assumed to trace the dark matter dis¬ 
tribution on scales larger than the Jeans length of the baryonic gas, but is smooth over structures on 
scales less than the Jeans length. Within a reasonable range of the UV background radiation at high 
redshift, the absorption of the pre-collapsed baryonic clouds is found to be in good agreement with 
the features of Lya forest. In particular, the LCDM gives good fits to: 1) the number density of Lya 
lines; 2) the distribution of equivalent width; and 3) the redshift-dependence of the line number. 

Because different QSO’s forest of real samples cover different redshift ranges, it is not trivial to 
directly find the Fourier power spectrum from an ensemble consisting of such geometrically complex 
forests. However, this complex geometry can easily be regularized by the method described in §3. 
Using this technique, all samples were extended in comoving space, to cover 1024 bins with each 
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bin of comoving size ~ 2.5 Mpc. Thus, all QSO samples were treated uniformly. The Fourier 
spectrum can then be detected by the FFCs. When we are only interested in the shape, not the 
amplitude of the spectrum, the result is independent of the uncertainty of overall mean density. 

The spectra of LWT(fF > 0.36A), JB(FF > 0.32^) and JB(VF > O.lhA) in the entire redshift 
range 1.7 < z < 4.1 are plotted in Figures 8a, b, c, respectively. For comparison. Figure 9 shows the 
spectra of LWT(1F" > 0.361), JB(1F" > 0.321), SCDM and CHDM with W > 0.321. The error 
bars in the Figures 8, 9 come from the average over the samples of QSO’s absorption spectrum. The 
errors at large scale are about the same as that on small scales. This means that the spectrum can 
uniformly be detected on scales as large as about 80 h~^ Mpc by the FFCs. 

All spectra in Figures 8 and 9 are rather flat on the range of log k > —1, i.e. on scales less than 
5 h~^ Mpc, and slightly increases with scale in a range of 10 - 80 h“^ Mpc (log/c < —1). These 
results are consistent with the result of no correlation power on scales less than 10 h~^ Mpc. The 
flatness of these spectra can be described by the power law index a, which is found by fitting the 
observed spectra with power law P(k) oc /c". The results are: a = —0.26 ± 0.42 for the LWT 
{W > 0.361), a = -0.23 ± 0.41 for the JB (W > 0.321), and a = -0.23 ± 0.37 for the JB 
{W > 0.161). The two independent data sets, LWT and JB, show almost the same values of the 
index a. This strongly implies that this feature is common among the Lya forests. 

One can conclude that although the power spectrum seems to increase on large scales, as dis¬ 
cussed in §6.1, the values of a and its errors show that the distribution of the real sample is consistent 
with a flat spectrum (a ~ 0) on scales less than 100 h~^ Mpc. Moreover, the power spectra of the 
simulated samples of the SCDM, CHDM and LCDM also are quite flat, i.e. a = —0.93 ± 0.15 
for SCDM {W > 0.161) and a = —1.06 ± 0.08 for CHDM {W > 0.161). It appears that the 
difference in the results drawn from the 2-point correlation function and other statistical methods is 
not due to the possibility that the structures occur on large scales. 

6.3 Kurtosis on large scales 

From the flatness of the spectrum, it should not be concluded that the distribution of Lya clouds must 
be white noise. Instead, this may indicate that the power spectrum and two-point correlation function 
are not suitable for describing the statistical features of the system being considered. Statistically, 
it is essential to measure non-Gaussianity in order to detect the clustering of density field with flat 
power spectrum. 

This point can be illustrated by the simulation sample BGF As shown in Figure 7, the power 
spectrum of the SCDM (W> 0.361) is flat. The spectrum is almost the same as that of a random 
sample generated by eq.(5.14) with the redshift-dependence of the number density of the Lya lines 
included. That is, in each redshift range Az = 0.4 the number of lines in the random sample is taken 
to be the same as the SCDM sample. 

However, Figure 10 shows that the amplitude of the kurtosis spectrum for the SCDM sample 
is systematically larger than the random sample. Recall that the error bars in Figure 10 do not 
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Figure 8: 1-D spectra P{k)j (diamond) and (star) of a.) LWT Lya forest samples with 

width > 0.36 A; b.) JB with W > 0.32A and c.) JB with W > 0.16A. For clarity, the points 
pvar^k'^j plotted at log k + 0.05. 
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Figure 9: 1-D spectra P{k)j of LWT Lya forest samples with width > 0.36 A (circle); JB samples 
with W > 0.32 A (diamond); SCDM with W > 0.32A (triangle); CHDM with W > 0.32A 
(square). The spectrum is given by data in the entire redshift range 1.7 < z < 4.1. k is in unit 
h Mpc“^. The error bars are obtained from the average over the samples of QSO’s absorption 
spectrum. For clarity, the four types of points are slightly shifted from each others along the k axis. 



i 


Figure 10: Kurtosis spectra of BGF sample of SCDM model and random data (solid line), respec¬ 
tively. Dot line is for Kj = 0. 
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Figure 11: Kurtosis spectra for JB data (W> O.IGA), and random sample (dashed line). 

represent the 1 a Gaussian errors, but the 95% confidence level from the ensemble of the samples. 
The difference between the spectra of the SCDM and random sample is significant. The kurtosis of 
the random sample is completely given by the shot noise. As expected, the kurtosis of the random 
sample is monotonously decreasing with decreasing of j. 

In trying to apply this technique to observational data, one immediately encounters a serious 
problem. In order to compute the skewness and kurtosis spectrum of real data, subsets of the samples 
are needed for eqs.(5.3) and (5.4). Unlike the simulated samples, where as much data as needed can 
be generated, the available real data are limited. There are only = 43 samples (forests) for LWT 
and 78 for JB. In order to effectively use this data, M < files from among fhe complete Nt 
samples are chosen fo form a subsef. Various combinafions of fhe subsefs M are fhen combined fo 
form an ensemble. To invesfigafe fhe effecl of differenl combinafions, fhe subsefs M are formed by 
varying fhe fofal number of files chosen from fhe parenf disfribufion. Nr, as well as changing fhe 
order in which fhe individual files are selecfed. If is found fhaf fhe skewness and kurtosis calculafed 
from fhese M-file ensembles are very sfable until M confains as few as 7 or 8 files, i.e. until only 
approximately 5% of the total lines remain in the subset. The 95 % confidence intervals are fhen 
esfimafed from fhe ensembles. 

Figure 11 shows fhe kurtosis specfra of fhe JB dafa and ifs random sample. The difference 
befween JB sample {W > O.ltiA) and random sample is more significanl fhan fhe difference be- 
fween SCDM and random dafa. The amplifudes of fhe kurtosis specfrum are much higher fhan fhaf 
of random dafa on scales larger fhan 10 h^^Mpc. Figures 12 and 13 show fhe skewness and kurtosis 
specfra of fhe LWT (W > 0.36A) and JB (W > 0.32A) dafa sefs. Again, fhe fwo independenf 
dafa sefs show fhe same sfafisfical feafures. One can conclude fhaf fhe clustering of fhe Lya clouds 
can only be clearly described by higher (fhan 2) order sfafisfics. The disfribufion of fhe clouds is 
non-Gaussian on all scales being defected, i.e. less fhan 80 h~^Mpc. 
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Figure 12: Skewness spectra for samples of LWT (W> 0.36A) and JB (W> 0.32A). 



Figure 13: Kurtosis spectra for samples of LWT (W> 0.36A) and JB (W> 0.32A). 
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Figure 14: A section of the wavelet reconstruction of density fields for the BGF sample with W > 
O.lhA. A. the original (or scale j = 10) line distribution. B. C. and D. the reconstructed fields for 
scale j = 9, 8, and 7, respecfively. The disfance is in unite of h“^Mpc. 

6.4 Structures identified by multiresolution analysis 

Direcf idenlificalion of sfrucfures is imporfanf in fhe description of LSS, because if allows us fo 
“see” fhe clustering of objecf’s disfribufion and fo compare fhe clusters of simulated samples wifh 
observations. 

For weakly clusfered disfribufions, if is difficull, even impossible, fo disfinguish fhe clusters 
caused by dynamical inferacfion wifh fhaf of random fluclualions. Many clusfers are idenlified af 
fhe 2 or 3 (T confidence levels. If fhe idenlificalion is done on a large number of realizalions, some 
3cj evenls may be due fo random processes. We will discuss a melhod of idenlifying sfrucfures by 
a DWT mulliresolufion, in parlicular, how fo handle fhe random fluclualions in order fo gel robusl 
conclusions even when fhe idenlified clusters are contaminated by random fluctuations. 

The DWT multiresolution is based on eqs.(2.20) and (2.48). For a given distribution e(a:), 
one can decompose it into e^(x) on various scales. As an example, Figure 14 shows the original 
distribution of the LCDM {W > O.ltiA) (j = 10), and the result of the reconstruction on j = 9, 8 
and 7, corresponding to scales (in comoving space) of about 5, 10 and 20 h~^ Mpc, respectively. 

The peaks in the field e^{x) correspond fo high densily regions, and are possible clusters on 
scale j. Various melhods of slruclure idenlificalion are designed fo pick oul such high densily 
regions. Since fhe MFCs, ej^i, represenl fhe densily of fhe field al position I and on scale j, we 
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can directly identify possible clusters by picking out the peaks of the MFC distribution. Figure 15 
shows a section of the MFC reconstructed distribution for the forest of QSO-0237. The error bars in 
Figure 15 are 1-a calculated from 100 uniformly random samples which match the number of lines 
in each redshift interval, Az = 0.4, of the parent sample QSO-0237. The peaks of ej^i distribution 
are identified as clusters of Lya clouds on scale j and position (/). The strength or richness of the 
clusters can be measured by i? = ej^i / a. 

Using this identification scheme, one can count the number strength distribution of clusters, 
Nj{> R), which is defined as fhe fofal number of fhe clusfers on scale j wifh sfrengfh larger fhan a 
given R. For fhe BGF sample of fhe LCDM (W < O.lOA), fhe Nj{> R) on scales j = 9, 8 and 7 is 
ploffed in Figure 16, in which fhe error bars are given by fhe average among fhe 20 BGF simulated 
samples. For the JB {W > 0.16A), the results of Nj{> R) are shown in Figure 17. 

The advantage of using the DWT to identify clusters is the ability to systematically study 
clusters on all scales and with various strengths. It is easy to estimate the number of clusters which 
are due to fluctuations, because the mother functions (pj^i are orthogonal with respect to I and each 
position I corresponds to an independent realization. For instance, when j = 9, the MFCs ej^i are 
detected from 2® = 1024 realizations. In this case the number of i? < 2 events for white noise 
should be about 1024 x 0.03 ~ 30. The shape of the function Nj{> R) is more important. In 
the case of white noise, the number strength function normalized at Rq, i.e. Nj{> R)/Nj{> Rq), 
should be erfc(i?)/erfc(i?o), erfc(x) being the complementary error function. Again, for white 
noise, the number-strength distributions on different scales should be satisfied fhe relation Nj{> 
R) ~ R). 

From Figures 16 and 17, if is easy fo see fhaf for bofh fhe BGF and JB dafa, we have Nj{> 
R)/Nj{> 2) > erfc(ii)/erfc(2), and Nj{> R) > 2^Nj-n{> 2) for all R > 2. Bofh samples 
confain clusfers nof originating from random flucfuafions. 


6.5 Discrimination among models 

Since fhe DWT can uniformly calculafe sfalislical quantities on differenl scale and differenl orders, 
if provides a more powerful fool fo dissecf models. In fhe lasf few sections, we have performed fhis 
dissecfion on fhe BGF samples. As menfioned in §6.1 fhese samples are linear. Nonefheless using 
fradifional fesfs, such as fhe number densify, fwo-poinf correlafion function, fhe Gunn-Peferson effecl 
efc, fhey show fhe same behavior as fhe observational dafa. However, fhese fesfs involve af mosf, 
second order sfalisfics. If fhe Lya clouds have undergone non-linear clustering, we should expecf 
fhaf fhe BGF samples should nof compare wifh fhe real dafa when compared using higher order fesfs. 
The resulfs of §6.2 and 6.3, do indeed, confirm fhis. Thai is, in ferms of second order sfafisfics, fhe 
BGF dafa behaves like fhe real dafa, buf for higher order fesfs, fhe BGF shows significanf differences 
from fhe LWT and JB dafa sefs. 
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Figure 15: A section of the MFC coefficients of QSO-0237 Lya forest with line width W > O.lhAa 
BGF and the corresponding random samples. A. B. and C. are for scales j = 9, 8 and 7, respectively. 
The distance is in units of /i“^Mpc. The error bars represent one a given by 100 random samples. 
It is interesting to note that the j = 9 clusters shown around 2465-2480 h~^ Mpc also appear as 
j = 8,1 clusters at the same place. That is, this structure appears at all three resolution scales. On 
the other hand, the structure appearing at 2505-2520 Mpc only appears on the scales j = 9 and 8, 
but not on larger scales. 
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Figure 16: The number of elusters Nj{> R) identified from BGF samples, where R is the riehness 
of the elusters in units of a. A. B. and C. are for seales j = 9, 8 and 7, respeetively. The error eomes 
from average among 20 BGF samples. 


For instanee, the ratios of Nj{> R)/Nj/{> R) and Nj{> R)/Nj{> R') are very dependent 
on non-Gaussian (i.e. higher order) behavior (§6.3) and so it ean also be used as a diseriminator. 
From Figures 16 and 17, one ean find that Nq{> 3.5)/Nq{> 2) = (7 ± 1.5)% for BGF sample, 
but Ng{> 3.5)/Ng{> 2) = 23% and 28% for the LWT and JB data, respeetively. The differenee 
of Nj{> 4)/Nj{> 2) between real and simulated sample is more remarkable. Almost no i? > 4.5 
elusters are deteeted in BGF samples, while they exist in the real sample. 

We ean reeast these results by looking at the redshift-dependenee of the number of elusters. 
As in eq.(6.1), we analyzed the evolution of the number density of elusters on seales j = 9, 8, 
and 7. The results are plotted in Figure 18. Figure 18a is for LWT {W > 0.36A), and 18b for JB 
{W > O. 32 A). The two data sets showed, onee again, the same features. The top eurves in Figures 
18a and b are the original results (number density of Lya lines) of LWT and JB. This number density 
inereases with redshift. However, the number densities of j = 9, 8 and 7 elusters show an opposite 
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Figure 17: Number of elusters Nj{> R) identified from the JBW> 0.16A, where R is the riehness 
of the elusters in units of a. A. B. and C. are for seales j = 9, 8 and 7, respeetively. 

evolution, deereasing with inereasing redshift. That is, large seale struetures traeed by Lya lines 
were growing from the era z = 4 to 2. This evolutionary feature was never revealed before the 
DWT analysis. 

We did the same analysis as above with the BGF samples. The results are shown in Figure 
19. Figure 19a is for BGF (W > O.lhA), and 19b for JB (W > O.lhA). We note first that the 
evolution of the number densities of j = 9, 8 and 7 elusters have the same trend as real data: dN/dz 
is deereasing with redshift. If we fit the eurves dN/dz with the power law eq.(16), both the BGF 
and JB give about the same index 7. This shows that the linear approximation is eorreet to model 
the evolutionary trend. 

Yet, the values of dN/dz for the j = 9, 8 and 7 elusters of the BGF data are less than JB’s 
results by a faetor of about 5. Considering that both LWT and JB have about the same number 
density of Lya lines (top eurves of Figures 19a and b), the different number density of their elusters 
should not be fully given by the uneertainty of eurrent observations. The diserepaney is due to the 
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Figure 18: clN/dz vs (1 + z) of A. LWT data of PF > 0.36^, B. JB data of IF > 0.32^. The top 
eurves are given by original Lya lines. The lower eurves are from the identified elusters on seales 
j = 9, 8 and 7, respeetively. 

faet that the linear simulation underestimated elustering on large seales. 

7 Miscellaneous topics 

Topies ineluded in this seetion should not be eonsidered less important, but rather, less developed. 

1. Scale-dependence of bias 

Bias is introdueed to deseribe how the spatial distribution of objeets like galaxies and galaxy 
elusters is related to that of the underlying mass. Theoretieally, linear bias b for a given type of 
objeet is defined as 

^{x^object — be(^x')mass ■ C^-l) 

Applying fhe DWT eq.(7.1) gives 


^j,l)objects — 


(7.2) 


Considering fhe densify field is homogeneous, bofh {ej^i)objects and {ej^i)mass should be independenf 
of index 1. In eq.(7.2), one ean replaee ey; by ifs average ■ Generally, {ej^i)objects and 

{(-j,i)mass are j-dependenf, and Iherefore fheir ratio is also seale-dependenf. This is frue even for a 
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Figure 19: dN/dz vs (1 + z) of A. the BGF sample of LCDM with W > O.IGA, B. JB data of 
W > O.IGA. Both top eurves are given by original Lya lines. The lower eurves are from the 
identified elusters on seales y = 9, 8 and 7. The amplitudes of the BGF eurves of j = 9, 8 and 7 are 
mueh lower than the eorresponding amplitudes of the Beehtold data. 


linear bias, i.e., the faetor b is generally seale-dependent. As a eonsequenee, it would be better to 
define fhe bias faefor of fhe disfribufion of objeef 1 wifh respeef fo objeef 11 by FFCs as follows 


(EzIgvI)/ 
' (EzIgvI)// 


(7.3) 


where fhe subseripfs I and II denote fhe FFCs for objeefs 1 and 11, respeefively. 

For insfanee, figures 17 and 20 show fhaf [A" 9 (> 2)/Ns{> 2)]o.32 > [A^ 9 (> 2)/A"8(> 2)]o.i6. 
buf [A"8(> 2)/Ny{> 2)]o.32 < [A8(> 2)/A'7(> 2)]o.i6? where fhe subseripfs 0.16 and 0.32 denote 
fhe elouds of FF > 0.16A and W > 0.32A of JB samples. This resulf is nol eonsisfenf wifh bs = 67 , 
and Iherefore, fhe bias faelors bj of VF > 0.16A elouds wifh respeef fo FF > 0.32A are probably j- 
dependenf. Of eourse, fhe uneerfainfy of fhe eurrenf Lya foresl dafa is sfill large. The j-dependenee 
of bias is only a very preliminary resulf. However, if already shows fhaf fhe fealures of a bias faefor, 
like seale-dependenee, ean properly be deseribed by fhe FFCs. 

Bias essentially is one of fhe environmenf-dependenf effeels. On seales equal fo or larger fhan 
galaxy elusfers, fhe key parameter of fhe environmenf should be fhe loeal baekground density. The 
DWT provides a uniform proeedure fo measure fhe loeal baekground densify on various seales. If 
ean also Iransform a poinf-like disfribufion fo MFCs on various seales, and reeonsfruef fhe density 
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Figure 20: Number of elusters Nj(> R) identified from the JB fF > 0.32A, where R is the riehness 
of the elusters in units of a. A. B. and C. are for seales j = 9, 8 and 7, respeetively. 

field on various seales. Environmental effeets ean be revealed by the eorrelation with MFCs. The 
eross-eorrelation between MFC and FFC is partieularly important, sinee it deseribes how elustering 
depends on environment. 

2. Extraction of fractal 

Observational data indieates that the elustering of galaxies seems to be seale-free. On the 
other hand, a pure fraetal distribution eontradiets with the observed angular eorrelations on large 
scales. A complete picture of the FSS may need to incorporate fractal structures into a homogeneous 
background (Fuo and Schramm 1994). However, on which scale will the fractal features end, and 
turn to a homogeneous distribution? This issue is a subject in debate. Statistically, we need a method 
of extracting fractal structures from a homogeneous random background, and detecting the scale of 
the fractal ending. Since the bases for the DWT are self similar, the choice of the DWT for fractal 
analysis is a natural one. 
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To illustrate this point, we eonsider a Gaussian field with small self-similar (multifraetal) strue- 
tures added. In a eonventional multifraetal analysis a distribution is first expressed as, for example, 
eq.(2.10). The sealing behavior is deteeted by 

(7.4) 

i 

For a Gaussian baekground, TcaussianiQ) = 7 — 1- Sinee the amplitudes relleet absolute values, 
this sealing approaeh is quite insensitive to the small self-similar eomponent when the large Gaussian 
baekground dominates. On the other hand, the FFCs ij^i foeus on differenees between neighboring 
parts of the distribution, and therefore, the smooth Gaussian baekground drops out for small seales I 
and the FFCs are only determined by the selfsimilar eomponent. Henee, the fraetal is easily extraeted 
by 

(7.5) 

i 

The fraetal eomponent ean be deteeted if P{q) is different from {2q — 1). For large j Gaussian 
behavior take over again. One may then be able to find fhe seale af whieh fhe disfribufion fransforms 
from a fraefal fo a Gaussian disfribufion. 

3. Hierarchical clustering and scale-scale correlations 

Besides sfafisfieal deseripfions, fhe DWT represenfafion eould be valuable for dynamieal sfudy. 
Dynamies is eerfainly represenfafion-independenf. Dynamieal solutions found in Fourier represen- 
fafion ean be found by a DWT mode expansion. Praefieally, however, differenl represenfafion are 
nof equivalenf beeause we eannof ealeulafe fhe mode-mode eoupling (or eorrelafions) on all orders, 
buf only on a few lower orders. These lower orders are differenl for differenl mode deeomposilions. 
In olher words, differenl bases will reveal differenl aspeels of fhe LSS dynamies. Differenl seleelion 
of fhe bases funelions is neeessary lo sfudy differenl aspeels of fhe LSS. To illuslrale Ihis poinf, lei’s 
sfudy so-ealled merging proeess in slruelure formation. 

In fhe slandard seenario of slruelure formation, larger dark mailer halos are generally eonsid- 
ered lo form hierarehieally from Ihe eluslering of smaller halos. One ean Iraee Ihe histories of dark 
mailer halos on Ihe assumption lhal Ihere is a relation belween Ihe halos on differenl seales. In order 
to exploil Ihe DWT’s abilily to exlrael information of Ihe merging proeess, several toy models of lex- 
lures have been investigated. These inelude Ihe p-model, p-model wilh random branehing, a-model, 
QCD-molivaled easeade model, ele. For inslanee, similar to Ihe Bloek model of merge frees (Cole, 
1991), Ihe p-model assumes lhal mass M in a L-lh order (seale) objeel is splil unequally into Iwo 
L/2-lh objeels. The proeess repeals, easeading down Ihrough seales of lenglh L/4, L/8,.. till Ihe 
seale L/n. The final dislribulion generated by p-model is a number density field of L/n-lh objeels. 
The Iwo-poinl eorrelalion funelion of Ihe L/n-lh objeels shows Ihe same power-law behavior as 
usually found in galaxy and olher LSS samples. 

However, Ihe Iwo-poinl eorrelalion based on monoseale expansion does nol lake Ihe hierarehi- 
eal proeess into aeeounl. In fael, Ihe power-law behavior of Ihe Iwo-poinl eorrelalion ean be realized 
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in many models. It is not an optimal choice to detect the merging trees. A better choice is the 
correlation between FFCs on different scales. One can show that for the p-model, the scale-scale 
correlations of FFCs is completely compressed into the diagonal (Greiner, Lipa, & Carruthers 1995). 
That is, the scale-scale correlation between FFCs can provide information on how larger structures 
with substructures living inside are formed. In this sense, the higher order FFC correlation are 
sensitive to the merging dynamics of the LSS. 

8 Outlook 

The wavelet transform is a very new technique in physics. As with many other mathematical meth¬ 
ods being introduced into physics, the problem in the first phase of development is a lack of feeling 
for the physical meaning of the relevant mathematical quantities. In the mind of physicists, the 
Fourier coefficient is not only a result of a mathematical transform, but directly “seen” as an excited 
mode, a “particle” with a given momentum, the energy of the model etc. On the other hand, the 
MFCs and FFCs are far from becoming accustomed notions. 

This being the case, we should first try to search for a better understanding of the physical 
meaning of the wavelet coefficients. Fortunately, the wavelet community in different fields of 
physics has shared much wifh each ofher. The DWT is being rapidly infroduced in physics in¬ 
cluding furbulence (Farge 1992), mulfiparficle dynamics (Greiner el al. 1996, Huang el al. 1996), 
disordered solid slafe systems (Kanlelhardl, Greiner & Roman 1995) and quanlum algebras (Ludu & 
Greiner 1995). In Ihe conlexl of LSS, Ihe wavelel fransform cerlainly opens new possibililies. The 
resulls reviewed here already demonslrale some of Ihe superior properfies of Ihe wavelel fransform 
in revealing Ihe physics of LSS. As we become more familiar wifh Ihe technique and slarl lo build 
inluilion abouf Ihe MFCs and FFCs, Ihe wavelel fransform should reveal yel more informalion aboul 
LSS. 
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Appendix 


A Relationship between Fourier coefficients and FFCs 


From eq.(2.41), we have 

^j,l+K = 


POO 

' —oo 

POO 

' —oo 


e{x)i^j^i+K{x)dx 


e(x) 


i] 


1/2 


ip{2^ x/L — I — K)dx 


2^ 


1/2 


e{x' + 2-^KL) ( i>(2^xJjL - l)dx' 


(Al) 


here we make a ehange of variable x' = x — 2 ^KL. Therefore, when 2 ^K is an integer, i.e. 
K = 2^m, eq.(Al) is 


2^' 


1/2 




e{x' + mL) ( ^ 1 jL — l)dx' = ij^i 


whieh shows that the FFC, ;, are periodie in I with period 2T 

Substituting the wavelet expansion of e(x), i.e. eq.(2.49) into eq.(2.51), we have 


ii 

Using eq.(A2), eq.(A3) beeomes 
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This is eq.(2.58). An alternative form, whieh uses the Fourier transform of the basie funetion 
rather than Tpj^i{x), ean be derived from eq.(A4) as follows 


1 

jY.Y.hl 

^ j=0 Z=0 

oo 2t-l /2i\ 

zEE ij^ij f—j i;{2^x/L-l)e-^^^^^/^dx 


^r). — 


1 


i=o z=o 

oo 2t-l 


' —OO 

- 1/2 


= iEE 

j =0 z=o 


/*CXJ 


oo 2t-l , s 1/2 

= S S ( W) 

/=0 Z=0 

This is eq.(2.59). 


(A5) 


B Parseval theorem of the DWT 


From the expansion (2.49) we have 

rL 


/ \e{x)\^dx = V V 

Jo JO 


jj'=0 l,l'=—c>o 

Considering the periodicity (A2), eq.(Bl) can be rewritten as 


(Bl) 
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{x/L — m ) — I — 2^rrC^)^{2^ {x/L — m') — l')dx 
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From 5jj>, j' should be equal to j, and then I' < 2^. Therefore, v requires m” = 

I = I'. We have then the Parseval theorem (4.3). 
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